SOME STATISTICAL PROBLEMS OF DIAGNOSIS BY MEANS
OF A MULTIPHASIC SCREENING EXAMINATION

This research report is written at an early stage in the revision and expansion
of the multiphasic physical examination of the Permanente Medical Group.
The concept of multiphasic screening in medicine refers to a series of tests
performed to determine whether there is sufficient likelihood of certain
diseases being present to warrant further testing for these diseases.
Multiphasic screening examinations must be suitable for routine application to a
large number of patients, as well as comprehensive, accurate, and efficient.

The problem is to determine the type of data to be obtained from each
patient and a process for reaching a diagnostic decision that is suitable
for handling by an electronic computer. This report is intended to
outline a statistical formulation of the problem, to provide references to
previous studies in related areas, to point out several statistical problems
to be anticipated at various stages in the development of such a program, and
to summarize a portion of the ideas set forth by the committee in charge.

The approach which is suggested in this paper involves determining a set
of diseases and/or disease classes and for each of these classes a set of
diagnostic questions and tests, each having a positive or a negative response.

On the basis of a patient's responses to these tests he will be either
dismissed as free from these diseases or referred for further diagnostic study
in one or more of the disease classes. The number of classes must be large
enough to render the examination worthwhile, yet it must be subject to the
limitations that cost and convenience place upon the type and upon the total
number of questions and tests that are feasible for a multiphasic examination.


The problem becomes statistical inasmuch as few signs or symptoms, if
any, have the property of always occurring when a particular disease is present
and never occurring when it is not. In view of the basic approach to the
examination as outlined above, it seems possible to consider a sequence of
two-decision problems, or problems in testing one hypothesis against another.
Suppose, then, there is to be established a set of $m$ disease classes,
$C_1, \ldots, C_m$, with a set of simple hypotheses versus simple alternatives,
$H_i$: Patient should be examined more thoroughly with regard to disease class
$C_i$, versus $K_i$: Patient does not show signs of a disease in class $C_i$,
$i = 1, 2, \ldots, m$. Hereafter, the notation shall be $H_i : C_i$ vs.
$K_i : C_i^{-1}$.

For each of the $m$ hypotheses, a set of diagnostic questions and tests
must be determined to yield a diagnostic vector
$X_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$. The $X_{ij}$ can assume the values
zero or one according as the corresponding symptom is absent or present. Hence
there are $2^{n_i}$ theoretically possible points $x_i$. It is assumed that the
probability of observing a point $x_i$ for a patient in $C_i^{-1}$ is different
from that for a patient having a disease in class $C_i$; i.e.,
$\Pr(x_i \mid C_i) \neq \Pr(x_i \mid C_i^{-1})$, with the possible exception of
a few $x_i$. Hence, we are able to determine a best critical region
$\mathcal{C}_i$ for the rejection of the hypothesis $H_i$, using the
Neyman-Pearson fundamental lemma, and to determine the probabilities of error.
The probability of rejecting $H_i$ when it is in fact true is given by

\[
\alpha_i = \Pr(X_i \in \mathcal{C}_i \mid C_i)
         = \sum_{x_i \in \mathcal{C}_i} \Pr(x_i \mid C_i) ,
\]

and the probability of accepting $H_i$ when it is in fact false is

\[
1 - \beta_i = \Pr(X_i \notin \mathcal{C}_i \mid C_i^{-1})
            = 1 - \sum_{x_i \in \mathcal{C}_i} \Pr(x_i \mid C_i^{-1}) .
\]

The region $\mathcal{C}_i$ is best in the sense that, subject to
$\alpha_i \leq \alpha_i'$, where $\alpha_i'$ is some maximum tolerable error for
$H_i$ and we may have $\alpha_i' = \alpha$ for all $i$, $\mathcal{C}_i$
maximizes $\beta_i$. The set of symptoms to be observed must be determined so
that $\beta_i \geq \beta_i'$, where $\beta_i'$ is some minimum power for the
test of $H_i$ and, again, we may have $\beta_i' = \beta$ for all $i$. In
addition to satisfying this minimum power requirement, the set of symptoms to be
observed must satisfy the restrictions discussed with respect to maximizing the
number of classes.
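
As an illustration of the construction just described, the following Python sketch builds a rejection region for a single hypothesis by ordering the 0/1 symptom vectors by likelihood ratio and admitting them until the tolerable error would be exceeded. It is only a sketch under stated assumptions: the symptom rates are hypothetical, the symptoms are taken to be independent within each population purely to generate example probabilities, and the greedy, non-randomized rule shown here is a simplification of the Neyman-Pearson construction (which in the discrete case may require randomization to attain the bound exactly). The names `point_probs` and `critical_region` are illustrative.

```python
from itertools import product

def point_probs(symptom_rates):
    """Pr(x) for every 0/1 vector x, assuming (for illustration only)
    that symptoms are independent with the given positive rates."""
    probs = {}
    for x in product((0, 1), repeat=len(symptom_rates)):
        p = 1.0
        for xi, rate in zip(x, symptom_rates):
            p *= rate if xi else 1.0 - rate
        probs[x] = p
    return probs

def critical_region(p_disease, p_healthy, alpha_max):
    """Greedy, non-randomized rejection region for H: patient is in C.

    Points are admitted in decreasing order of the likelihood ratio
    Pr(x | C^-1) / Pr(x | C) until the next point would push
    alpha = Pr(reject H | C) above alpha_max.
    """
    def ratio(x):
        if p_disease[x] > 0:
            return p_healthy[x] / p_disease[x]
        return float("inf")

    region, alpha, beta = [], 0.0, 0.0
    for x in sorted(p_disease, key=ratio, reverse=True):
        if alpha + p_disease[x] > alpha_max:
            break
        region.append(x)
        alpha += p_disease[x]   # error of rejecting H when it is true
        beta += p_healthy[x]    # power: rejecting H when it is false
    return region, alpha, beta

# Hypothetical rates of three symptoms in the two populations.
p_disease = point_probs([0.9, 0.8, 0.7])   # Pr(symptom present | C)
p_healthy = point_probs([0.1, 0.2, 0.3])   # Pr(symptom present | C^-1)
region, alpha, beta = critical_region(p_disease, p_healthy, alpha_max=0.10)
```

With these hypothetical rates the region consists of the vector with no symptoms present together with the three vectors having exactly one symptom present.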

Estimates of all probabilities, $\Pr(x_i \mid C_i)$ and $\Pr(x_i \mid C_i^{-1})$,
for all possible observations $x_i$ and for $i = 1, 2, \ldots, m$, are required
and must be obtained from a large sample of the population. Since we require
probabilities of the various configurations in the populations of patients
belonging to each of the $m$ classes, $C_i$, and in the populations which are the
complements, $C_i^{-1}$, of these $m$ classes, it is imperative that there be
other methods of diagnosis, whose verdict is taken to be correct, but which may
be considered infeasible for inclusion in the multiphasic screening examination.
In such instances, necessary follow-up to provide supplemental diagnostic
examinations must be made on all persons in the sample to give a final diagnosis
and classification into the various disease classes. Because of the time lag
necessary to achieve this final diagnosis and the low prevalence of many diseases
to be considered, it seems reasonable that observations may have to be taken not
only from the proposed multiphasic examination population but also from the
populations consisting of previously diagnosed cases. Studies will also be
required on the homogeneity of the population to which the examination is to be
given to determine whether such factors as age will require initial
classification of the population before classification according to final
diagnosis in obtaining estimates of the probabilities.
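
One simple way such estimates might be formed is by relative frequency within each finally diagnosed group. The Python sketch below assumes records of the form (symptom vector, final diagnosis); the function name `estimate_point_probs` and the six-patient sample are hypothetical and serve only to show the bookkeeping. With $2^{n_i}$ possible configurations and low prevalence, many configurations will not appear in a modest sample at all, which is one reason a large sample is called for.

```python
from collections import Counter

def estimate_point_probs(records):
    """Relative-frequency estimates of Pr(x | C) and Pr(x | C^-1).

    `records` is an iterable of (x, in_class) pairs, where x is a tuple of
    0/1 symptom indicators and in_class is the final (follow-up) diagnosis.
    """
    counts = {True: Counter(), False: Counter()}
    for x, in_class in records:
        counts[bool(in_class)][tuple(x)] += 1

    estimates = {}
    for in_class, counter in counts.items():
        total = sum(counter.values())
        # Configurations never observed simply receive no entry here.
        estimates[in_class] = (
            {x: n / total for x, n in counter.items()} if total else {}
        )
    return estimates[True], estimates[False]

# Hypothetical sample: two symptoms, six patients with a final diagnosis.
sample = [
    ((1, 1), True), ((1, 0), True), ((1, 1), True),
    ((0, 0), False), ((0, 1), False), ((0, 0), False),
]
p_in_class, p_not_in_class = estimate_point_probs(sample)
```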

There is also assumed a standardization and quantification of the various
symptoms. For example, more information is required from the patient
questionnaire than "Have you had a recent unexplained weight loss?" How much
weight was lost over what period of time might be a reasonable inquiry, where
certain combinations of time and amount might be considered significant.
Furthermore, even on symptoms which are necessarily quantitative, a value must
be determined to delineate normal and abnormal and hence to indicate when a
symptom measurement is negative ($X_{ij} = 0$) and when it is positive
($X_{ij} = 1$).


Having established the $m$ hypotheses $H_1 : C_1, \ldots, H_m : C_m$ with
corresponding alternatives $K_1 : C_1^{-1}, \ldots, K_m : C_m^{-1}$, critical
rejection regions $\mathcal{C}_1, \ldots, \mathcal{C}_m$, significance levels
$\alpha_1, \ldots, \alpha_m$, and powers $\beta_1, \ldots, \beta_m$, consider
simultaneous testing of these $m$ hypotheses. There are $2^m$ theoretically
possible true situations and $2^m$ sequences of decisions resulting from testing
the $m$ hypotheses. Denote by
$\theta_i = (\theta_{i1}, \theta_{i2}, \ldots, \theta_{im})$,
$i = 1, \ldots, 2^m$, the possible $m$-vectors consisting of 1's and -1's, and by
$H'_i : \bigcap_{j=1}^{m} C_j^{\theta_{ij}}$ the composite hypothesis that those
hypotheses $H_j$ for which $\theta_{ij} = 1$ are true and those for which
$\theta_{ij} = -1$ are false. Then $H'_i$ is to be accepted if and only if each
$H_j$ with $\theta_{ij} = 1$ is accepted and each $H_j$ with $\theta_{ij} = -1$
is rejected. For example, let $m = 2$, $\theta_i = (1, -1)$, so that
$\theta_{i1} = 1$ and $\theta_{i2} = -1$; then $H'_i : C_1 C_2^{-1}$ denotes the
hypothesis that the patient belongs to class $C_1$ but not to class $C_2$, and
we would accept $H'_i$ if and only if we accepted $H_1 : C_1$ and rejected
$H_2 : C_2$. Ideally, one would want estimates of

\[
\Pr\Bigl(X_{11}, \ldots, X_{mn_m} \Bigm| \bigcap_{j=1}^{m} C_j^{\theta_{ij}}\Bigr)
\]

for each of the $2^m$ disease states and critical regions of rejection for each
of the $2^m(2^m - 1)/2$ cases

\[
H'_i : \bigcap_{j=1}^{m} C_j^{\theta_{ij}}
\quad \text{vs.} \quad
H'_k : \bigcap_{j=1}^{m} C_j^{\theta_{kj}} .
\]

In view of the magnitude of the minimal number, $m$, of classes, the above
approximation is suggested, even though in practical application, prevalence
considerations would allow reduction of the number of disease class combinations
to be considered.
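
To give a sense of the magnitudes involved, the short Python sketch below enumerates the sign vectors that label the disease states and counts the pairwise tests that the ideal procedure would require. The function names `disease_states` and `pairwise_tests` are illustrative only, and the value $m = 10$ in the usage note is a hypothetical choice, not a figure from this report.

```python
from itertools import product

def disease_states(m):
    """All m-vectors of 1's and -1's, one per composite hypothesis."""
    return list(product((1, -1), repeat=m))

def pairwise_tests(m):
    """Number of distinct pairs among the 2**m disease states."""
    n_states = 2 ** m
    return n_states * (n_states - 1) // 2

states = disease_states(2)      # the four states for two disease classes
pairs_for_two = pairwise_tests(2)
pairs_for_ten = pairwise_tests(10)
```

For two classes there are 4 states and 6 pairwise tests; for ten classes there are already 1,024 states and 523,776 pairwise tests, against only ten tests under the approximation.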

The probabilities of error on each hypothesis $H_j$ will eventually be
known, but there is still the question of the total probability of error when
the $m$ hypotheses are compounded. Suppose that the $m$ two-decision problems
are independent. The probability of accepting $H'_k$ when $H'_i$ is actually
true becomes

\[
\Pr(H'_k \mid H'_i) = \prod_{j=1}^{m}
  (1 - \alpha_j)^{(1+\theta_{ij})(1+\theta_{kj})/4}\,
  (1 - \beta_j)^{(1-\theta_{ij})(1+\theta_{kj})/4}\,
  \alpha_j^{(1+\theta_{ij})(1-\theta_{kj})/4}\,
  \beta_j^{(1-\theta_{ij})(1-\theta_{kj})/4} .
\]


In particular, the probability of a correct decision is

\[
\Pr(H'_i \mid H'_i) = \prod_{j=1}^{m}
  (1 - \alpha_j)^{(1+\theta_{ij})/2}\, \beta_j^{(1-\theta_{ij})/2} .
\]


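Defining a healthy person as one for whom all $m$ hypotheses are false,
$\theta_{ij} = -1$, $j = 1, \ldots, m$, we have

\[
\Pr\Bigl(\bigcap_{j=1}^{m} C_j^{-1} \Bigm| \bigcap_{j=1}^{m} C_j^{-1}\Bigr)
  = \prod_{j=1}^{m} \beta_j ,
\]

so that the probability of doing further study in at least one disease class
for a person who is healthy is

\[
1 - \Pr\Bigl(\bigcap_{j=1}^{m} C_j^{-1} \Bigm| \bigcap_{j=1}^{m} C_j^{-1}\Bigr)
  = 1 - \prod_{j=1}^{m} \beta_j .
\]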


This last probability is a special case of the probability of at least one
wrong decision when $H'_i$ is true,

\[
1 - \Pr(H'_i \mid H'_i) = 1 - \prod_{j=1}^{m}
  (1 - \alpha_j)^{(1+\theta_{ij})/2}\, \beta_j^{(1-\theta_{ij})/2} .
\]
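
The compounding of errors under the independence assumption can be checked numerically. The Python sketch below computes the probability of reaching one sequence of decisions when another state is true, treating each of the $m$ tests as independent; the function name `pr_decision` and the error levels used (0.05 for each probability of rejecting a true hypothesis, 0.90 for each power) are hypothetical values chosen only for illustration.

```python
from itertools import product

def pr_decision(theta_true, theta_decided, alpha, beta):
    """Probability of the decision vector given the true state vector,
    assuming the m two-decision problems are independent.

    Entries are +1 (hypothesis H_j true / accepted) or -1 (false / rejected).
    alpha[j] = Pr(reject H_j | H_j true); beta[j] = Pr(reject H_j | H_j false).
    """
    p = 1.0
    for t, d, a, b in zip(theta_true, theta_decided, alpha, beta):
        if t == 1:
            p *= (1.0 - a) if d == 1 else a
        else:
            p *= (1.0 - b) if d == 1 else b
    return p

# Hypothetical error levels for m = 3 disease classes.
alpha = [0.05, 0.05, 0.05]
beta = [0.90, 0.90, 0.90]
healthy = (-1, -1, -1)

pr_correct_healthy = pr_decision(healthy, healthy, alpha, beta)
pr_false_referral = 1.0 - pr_correct_healthy

# The probabilities of all 2**m decision vectors must sum to one.
total = sum(pr_decision(healthy, d, alpha, beta)
            for d in product((1, -1), repeat=3))
```

Here a healthy person is correctly dismissed with probability 0.729 and is referred for further study in at least one class with probability 0.271. The same arithmetic with twenty classes, each with power 0.95, gives a referral probability of roughly 0.64, which shows how quickly the compounded error grows with $m$.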
Further study is required to determine conditions under which the $m$ tests
will be independent as well as to determine whether such conditions will be
satisfied in this application. If it is not reasonable to assume independent
decisions on each of the hypotheses $H_j$, then the whole question of
total error when compounding probabilities remains unanswered. Further study
and discussion are necessary also 1) to provide a systematic and statistically
valid method for selecting the symptoms to be observed for each disease class,
2) to determine the additional error induced in the decisions by the fact that
we know only random estimates of the desired parameters, and 3) to estimate the
total probability of a wrong decision on each test $H_i$ vs. $K_i$; i.e.,

\[
\Pr(X_i \in \mathcal{C}_i \mid C_i)\,\Pr(C_i)
  + \Pr(X_i \notin \mathcal{C}_i \mid C_i^{-1})\,[1 - \Pr(C_i)]
  = \alpha_i p_i + (1 - \beta_i)(1 - p_i) ,
\]

where $p_i$ is the prevalence of disease class $C_i$.
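
A small numerical sketch may help show how prevalence weights the two kinds of error in this total. The Python function below evaluates the expression directly; the name `total_error` and the values used (a 0.05 chance of dismissing a diseased patient, power 0.90, prevalence 0.01) are hypothetical and chosen only for illustration.

```python
def total_error(alpha, beta, prevalence):
    """Total probability of a wrong decision on one test.

    alpha: Pr(reject H | patient is in the disease class)
    beta:  Pr(reject H | patient is not in the disease class), the power
    prevalence: Pr(patient is in the disease class)
    """
    missed = alpha * prevalence                       # diseased but dismissed
    false_referral = (1.0 - beta) * (1.0 - prevalence)  # healthy but referred
    return missed + false_referral

# Hypothetical values: a rare disease class with a fairly powerful test.
err = total_error(alpha=0.05, beta=0.90, prevalence=0.01)
missed_part = 0.05 * 0.01
```

With these values the total is 0.0995, of which only 0.0005 comes from missed cases; at low prevalence the total is dominated by unnecessary referrals of healthy persons.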

The reader is referred to the following papers which have been helpful in
reaching the formulation of the problem as presented in this report. The theory
of testing a simple hypothesis against a simple alternative is treated in the
textbook by Neyman [9], and the examples include a discussion of screening for
tuberculosis. The studies of Chiang [5], Neyman [8], Taylor [10], and
Yerushalmy [12] point out interesting problems in obtaining estimates of
probabilities in the field of public health as well as illustrate the theory of
estimation due to Neyman [7]. The excellent paper by Chiang, Hodges, and
Yerushalmy [4] discusses in a very general way several applications of statistics
to medical diagnosis. The two papers on the problem of classification, Anderson
[1] and Birnbaum [3], are pertinent to this study since they both involve
deciding from which population a person comes on the basis of his vector
$(X_1, \ldots, X_n)$ of observed symptoms or traits. The two-decision problems
discussed in this report are but a special case of the k-decision problems
considered by Anderson or Birnbaum. Finally, the problem of generating a complex
statistical test by simultaneous consideration of simpler testing problems
is given theoretical treatment by Lehmann in [6]. While his paper does not
discuss the question of total error, the works of Birnbaum [2] and Wallis [11]
do discuss error in compounding tests, although in a different context from
the application in this study.